You should submit a knitted pdf file on Moodle, but be sure to show all of your R code, in addition to your output, plots, and written responses.
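One common way to make sure code appears in the knitted PDF is to set the echo chunk option globally. The lines below are a minimal sketch, assuming a standard R Markdown file knitted with knitr; where you place them (usually the first setup chunk) is up to you.

```r
# Typically placed in the first (setup) chunk of the Rmd file.
library(knitr)

# Show the R code alongside its output in the knitted document.
opts_chunk$set(echo = TRUE)
```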
as_tibble may be helpful, but you’ll want to make sure there are no duplicate variable names before converting a data set into a tibble.
Further hints:
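On the as_tibble point, here is a minimal sketch of two ways to deal with duplicate names before or during conversion. The data frame and its column names are made up for illustration; they are not from the assignment data.

```r
library(tibble)

# Toy data frame with a duplicated column name (invented for illustration).
raw <- data.frame(a = 1:2, b = 3:4)
names(raw) <- c("score", "score")

# Option 1: make the names unique yourself, then convert.
fixed <- raw
names(fixed) <- make.unique(names(fixed))
films_a <- as_tibble(fixed)

# Option 2: let as_tibble() repair the names during conversion.
films_b <- as_tibble(raw, .name_repair = "unique")

names(films_a)
names(films_b)
```

Calling as_tibble(raw) with its default settings would stop with an error about the duplicated name, which is why one of the two repairs is needed first.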
Hello! I’m making a change!
2 + 2
## [1] 4
Now I’m making a change from GitHub!!!
Use the rvest package to pull data from the link here with the top 50 grossing films from 2018. Generate a tibble that contains the title, gross, star rating (imdbscore), and metascore for the top 50 films. Then create a scatterplot of star rating versus gross. A couple of hints:
Identify which films of the top 50 from 2018 had the biggest discrepancy between reviewers (metascore) and viewers (star rating).
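The general shape of this task can be sketched as follows. Everything specific here is an assumption: the HTML is a small invented stand-in for the real page, the class names (.title, .gross, .stars, .meta) are hypothetical, and the three films are placeholders. The real page will need its own selectors. Putting the star rating on a 0 to 100 scale before comparing it with the metascore is just one possible way to define "discrepancy".

```r
library(rvest)
library(dplyr)
library(readr)

# Hypothetical markup standing in for the real page; the class names and the
# three films below are invented for illustration only.
page <- minimal_html('
  <div class="film"><span class="title">Film A</span><span class="gross">$200.1M</span><span class="stars">7.5</span><span class="meta">60</span></div>
  <div class="film"><span class="title">Film B</span><span class="gross">$150.0M</span><span class="stars">6.0</span><span class="meta">62</span></div>
  <div class="film"><span class="title">Film C</span><span class="gross">$90.3M</span><span class="stars">8.1</span><span class="meta">90</span></div>
')

# Pull each field as text, then convert the numeric ones with parse_number().
films <- tibble(
  title     = page %>% html_elements(".title") %>% html_text2(),
  gross     = page %>% html_elements(".gross") %>% html_text2() %>% parse_number(),
  imdbscore = page %>% html_elements(".stars") %>% html_text2() %>% parse_number(),
  metascore = page %>% html_elements(".meta")  %>% html_text2() %>% parse_number()
)

# One possible discrepancy measure: rescale stars (0-10) to 0-100 and compare.
ranked <- films %>%
  mutate(discrepancy = abs(imdbscore * 10 - metascore)) %>%
  arrange(desc(discrepancy))

ranked
```

On a real page, a film with a missing metascore would make the vectors different lengths, so it is worth checking the length of each extracted vector before building the tibble.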
5 points if you push your Rmd file with HW15 solutions along with the knitted pdf file to your MSCS264-HW15 repository in your GitHub account. So that I can check, make your repository private (good practice when doing HW), but add me (username = lfbv) as a collaborator under Settings > Collaborators.
library(tidyverse) # for read_csv, the dplyr verbs, stringr, and ggplot2
vaccine_data <- read_csv("Data/exam1data.csv")
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## State = col_character(),
## Date = col_date(format = ""),
## people_vaccinated = col_double(),
## total_distributed = col_double(),
## share_doses_used = col_double(),
## people_vaccinated_per100 = col_double(),
## Governor = col_character(),
## Region = col_character(),
## month0 = col_double(),
## day0 = col_double(),
## year0 = col_double(),
## est_population = col_double(),
## dist_per_person = col_double(),
## prev_day = col_double(),
## daily_vaccinated = col_double()
## )
vacc_mar13 <- vaccine_data %>%
  filter(Date == "2021-03-13") %>%
  select(State, Date, people_vaccinated_per100, share_doses_used, Governor) %>%
  # Clean the state names so they can be joined to the "region" column of the map data below.
  mutate(State = str_replace(State, " State", ""),
         State = str_to_lower(State))
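To see what the two string steps in the mutate() call do, here is a small sketch on a made-up vector of names. The example values are assumptions and not taken from exam1data.csv.

```r
library(stringr)

# Invented examples of how state names might appear in the data.
states <- c("New York State", "Minnesota")

# Remove a trailing " State", then lower-case everything.
cleaned <- str_to_lower(str_replace(states, " State", ""))
cleaned
```

The result is all lower case, which is the form used by the region column of map_data("state").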
library(viridis) # for color schemes
## Loading required package: viridisLite
library(maps)
##
## Attaching package: 'maps'
## The following object is masked from 'package:purrr':
##
## map
map_data("state") %>%
  left_join(vacc_mar13, by = c("region" = "State")) %>%
  ggplot(mapping = aes(x = long, y = lat,
                       group = group)) +
  geom_polygon(aes(fill = people_vaccinated_per100), color = "black") +
  labs(fill = "People Vacc.\nper 100 pop.") +
  coord_map() + # This scales the longitude and latitude so that the shapes look correct.
  theme_void() + # This theme can give you a really clean look!
  scale_fill_viridis() + # you can change the fill scale for different color schemes.
  labs(title = "Cumulative People Vaccinated per 100 population\nMarch 13, 2021")
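Because the left_join() above depends on the state names matching exactly, a state whose name was not cleaned would show up with no fill on the map. One way to check for that is an anti_join(), sketched here on toy tibbles. Both tibbles and their values are invented for illustration; with the real data you would use the distinct region values from map_data("state") and vacc_mar13.

```r
library(dplyr)

# Toy stand-ins (invented values): map regions and a vaccine table
# in which one state name was not cleaned.
map_regions <- tibble(region = c("minnesota", "new york", "iowa"))
vacc_toy <- tibble(
  State = c("minnesota", "new york state"),
  people_vaccinated_per100 = c(20, 18)
)

# Rows of map_regions that have no matching State in vacc_toy.
unmatched <- anti_join(map_regions, vacc_toy, by = c("region" = "State"))
unmatched
```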
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
(vaccine_data %>%
  group_by(Region, Date) %>%
  summarize(people_vacc_total = sum(people_vaccinated_per100)) %>%
  ggplot(mapping = aes(x = Date, y = people_vacc_total, color = Region)) +
  geom_point() +
  geom_line() +
  labs(title = "Cumulative People Vaccinated per 100 population",
       y = "People/100 Population",
       x = "Date")) %>%
  ggplotly()
## `summarise()` has grouped output by 'Region'. You can override using the `.groups` argument.
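The message above is informational: after summarizing by two grouping variables, the result is still grouped by the first one. A minimal sketch of the .groups argument it mentions, using a toy tibble with invented values:

```r
library(dplyr)

# Toy data (invented): two states per region on a single date.
toy <- tibble(
  Region = c("Midwest", "Midwest", "South", "South"),
  State  = c("state 1", "state 2", "state 3", "state 4"),
  Date   = as.Date("2021-03-13"),
  people_vaccinated_per100 = c(20, 22, 18, 19)
)

# .groups = "drop" returns an ungrouped tibble and silences the message.
by_region <- toy %>%
  group_by(Region, Date) %>%
  summarize(people_vacc_total = sum(people_vaccinated_per100),
            .groups = "drop")

by_region
```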
library(leaflet)
airbnb.df <- read_csv("Data/airbnbData_full.csv")
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## .default = col_double(),
## Title = col_character(),
## baseurl = col_character(),
## AboutListing = col_character(),
## HostName = col_character(),
## MemberDate = col_character(),
## BookInstantly = col_character(),
## Cancellation = col_character(),
## P_Cleaning = col_character(),
## P_Deposit = col_character(),
## P_ExtraPeople = col_character(),
## P_Monthly = col_character(),
## P_Weekly = col_character(),
## R_CI = col_character(),
## R_acc = col_character(),
## R_clean = col_character(),
## R_comm = col_character(),
## R_loc = col_character(),
## R_val = col_character(),
## RespRate = col_character(),
## RespTime = col_character()
## # ... with 7 more columns
## )
## ℹ Use `spec()` for the full column specifications.
## Warning: 4 parsing failures.
##  row col           expected actual    file
## 1961 S_Accomodates a double Not Found 'Data/airbnbData_full.csv'
## 1961 S_NumBeds     a double Not Found 'Data/airbnbData_full.csv'
## 1993 S_Accomodates a double Not Found 'Data/airbnbData_full.csv'
## 1993 S_NumBeds     a double Not Found 'Data/airbnbData_full.csv'
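The parsing failures above come from the text "Not Found" appearing in columns that are otherwise numeric. One possible way to handle this, sketched on a tiny inline CSV with invented rows, is to tell read_csv() to treat that text as missing through its na argument. This is an illustration only; the original analysis simply reads the file as shown.

```r
library(readr)

# Tiny inline CSV standing in for the real file (rows are invented).
csv_text <- "Title,S_Accomodates,S_NumBeds\nListing A,4,2\nListing B,Not Found,Not Found\n"

# Treat "Not Found" as NA so the numeric columns parse without failures.
toy <- read_csv(
  csv_text,
  na = c("", "NA", "Not Found"),
  col_types = cols(
    Title = col_character(),
    S_Accomodates = col_double(),
    S_NumBeds = col_double()
  )
)

toy
problems(toy) # no rows expected, since "Not Found" is now read as NA
```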
# Mark the listing text as UTF-8, then re-encode it and drop (sub = "")
# any bytes that cannot be converted.
Encoding(airbnb.df$AboutListing) <- "UTF-8"
airbnb.df$AboutListing <- iconv(x = airbnb.df$AboutListing,
                                from = "UTF-8",
                                to = "UTF-8",
                                sub = "")
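The encoding step above can be tried on a single string. In this sketch the string is invented and contains one byte (\xe9) that is not valid UTF-8 on its own; the exact behavior of iconv() can vary somewhat across platforms, so treat this as an illustration instead of a guarantee.

```r
# Invented example: "\xe9" is a Latin-1 byte that is invalid as UTF-8.
x <- "caf\xe9 loft"
Encoding(x) <- "UTF-8"
validUTF8(x) # FALSE before cleaning

# Re-encode UTF-8 to UTF-8, dropping bytes that cannot be converted.
clean <- iconv(x, from = "UTF-8", to = "UTF-8", sub = "")
validUTF8(clean)
```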
# This part makes the map!
leaflet() %>%
  addTiles() %>%
  setView(lng = mean(airbnb.df$Long), lat = mean(airbnb.df$Lat),
          zoom = 13) %>%
  addCircleMarkers(data = airbnb.df,
                   lat = ~ Lat,
                   lng = ~ Long,
                   popup = ~ AboutListing,
                   radius = ~ S_Accomodates, # These last options describe how the circles look
                   weight = 2,
                   color = "red",
                   fillColor = "yellow")